Evaluating large-scale Knowledge Resources across Languages

نویسندگان

  • Montse Cuadros
  • German Rigau
چکیده

This paper presents an empirical evaluation in a multilingual scenario of the semantic knowledge present on publicly available large-scale knowledge resources. The study covers a wide range of manually and automatically derived large-scale knowledge resources for English and Spanish. In order to establish a fair and neutral comparison, the knowledge resources are evaluated using the same method on two Word Sense Disambiguation tasks (Senseval-3 English and Spanish Lexical Sample Tasks). First, this study empirically demonstrates that the combination of the knowledge contained in these resources surpass the most frequent sense classifier for English. Second, we also show that this large-scale topical knowledge acquired from one language can be successfully ported to other languages.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Harmonised large-scale syntactic/semantic lexicons: a European multilingual infrastructure

The paper aims at providing an overview of the situation of Language Resources (LR) in Europe, in particular as emerging from a few European projects regarding the construction of large-scale harmonised resources to be used for many applicative purpose, also of multilingual nature. An important research aspect of the projects is given by the very fact that the large enterprise described is, at ...

متن کامل

Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons for Twelve Languages

The last two decades have seen the development of various semantic lexical resources such as WordNet (Miller, 1995) and the USAS semantic lexicon (Rayson et al., 2004), which have played an important role in the areas of natural language processing and corpus-based studies. Recently, increasing efforts have been devoted to extending the semantic frameworks of existing lexical knowledge resource...

متن کامل

Orthographic and Phonological Neighborhood Databases across Multiple Languages.

The increased globalization of science and technology and the growing number of bilinguals and multilinguals in the world have made research with multiple languages a mainstay for scholars who study human function and especially those who focus on language, cognition, and the brain. Such research can benefit from large-scale databases and online resources that describe and measure lexical, phon...

متن کامل

Knowledge-Based Multilingual Document Analysis

The growing availability of multilingual resources, like EuroWordnet, has recently inspired the development of large scale linguistic technologies, e.g. multilingual IE and Q&A, that were considered infeasible until a few years ago. In this paper a system for categorisation and automatic authoring of news streams in different languages is presented. In our system, a knowledge-based approach to ...

متن کامل

Large Scale Translation Quality Estimation

This study explores methods for developing a large scale Quality Estimation framework for Machine Translation. We expand existing resources for Quality Estimation across related languages by using different transfer learning methods. The transfer learning methods are: Transductive SVM, Label Propagation and Self-taught Learning. We use transfer learning methods on the available labelled dataset...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007